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Amendment to the Specification: 

Please amend the specification as follows: 

Please replace the paragraph 265 on pages 70 to 71, with the following amended 

paragraph: 

[0265] A "comparison window", as used herein, includes reference to a segment of any one 
of the number of contiguous positions selected from the group consisting of from 20 to 600, usually 
about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared 
to a reference sequence of the same number of contiguous positions after the two sequences are 
optimally aligned. Methods of alignment of sequence for comparison are well-known in the art. 
Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology 
algorithm of Smith & Waterman, Adv. AppL Math. 2:482, 1981, by the homology alignment 
algorithm of Needleman & Wunsch, J. Mol. Biol 48:443, 1970, by the search for similarity method 
of person & Lipman, Proc. Natl Acad. Sci. USA 85:2444, 1988, by computerized implementations 
of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software 
Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and 
visual inspection. Other algorithms for determining homology or identity include, for example, in 
addition to a BLAST program (Basic Local Alignment Search Tool at the National Center for 
Biological Information), ALIGN, AMAS (Analysis of Multiply Aligned Sequences), AMPS 
(Protein Multiple Sequence Alignment), ASSET (Aligned Segment Statistical Evaluation Tool), 
BANDS, BESTSCOR, BIOSCAN (Biological Sequence Comparative Analysis Node), BLIMPS 
(BLocks IMProved Searcher), FASTA, Intervals & Points, BMB, CLUSTAL V, CLUSTAL W, 
CONSENSUS, LCONSENSUS, WCONSENSUS, Smith-Waterman algorithm, DARWIN, Las 
Vegas algorithm, FNAT (Forced Nucleotide Alignment Tool), Framealign, Framesearch, 
DYNAMIC, FILTER, FSAP (Fristensky Sequence Analysis Package), GAP (Global Alignment 
Program), GENAL, GIBBS, GenQuest, ISSC (Sensitive Sequence Comparison), LALIGN (Local 
Sequence Alignment), LCP (Local Content Program), MACAW (Multiple Alignment Construction 
& Analysis Workbench), MAP (Multiple Alignment Program), MBLKP, MBLKN, PIMA (Pattern- 
Induced Multi-sequence Alignment), SAGA (Sequence Alignment by Genetic Algorithm) and 
WHAT-IF. Such alignment programs can also be used to screen genome databases to identify 
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polynucleotide sequences having substantially identical sequences. A number of genome databases 
are available, for example, a substantial portion of the human genome is available as part of the 
Human Genome Sequencing Project (J. Roach, http://wcber.u. Washington. e du/ - roach/human^ 
genome_ progr e ss 2.html) (Gibbo, 1995) . At least twenty-one other genomes have already been 
sequenced, including, for example, M. genitalium (Fraser et al., 1995), M. jannaschii (Bult et al., 
1996), H. influenzae (Fleischmann et al., 1995), R coli (Blattner et al., 1997), and yeast (S. 
cerevisiae) (Mewes et al., 1997), and D. melanogaster (Adams et al., 2000). Significant progress 
has also been made in sequencing the genomes of model organism, such as mouse, C. elegans, and 
Arabadopsis sp. Several databases containing genomic information annotated with some functional 
information are maintained by different organization, and are accessible via the internetrfe* 
exampl e , http://wwwtigr.org/tdb; http://wvvw.gcnetics.wisc.edu; http://genomc 
www.stanford.edu/ - ball; http://hiv web.Ianl.gov; http://www.ncbi.nlm.nih.gov; 
http://www. e bi.ac.uk; http://Pasteur.fr/other/biology; and http:// wvw.g e nome.wi.mit.edu . 

Please replace the paragraph 266 on pages 71 to 72, with the following amended 

paragraph: 

[0266] One example of a useful algorithm is BLAST and BLAST 2.0 algorithms, which are 
described in Altschul et al., Nuc. Acids Res. 25:3389-3402, 1977, and Altschul et al., J. Mol. Biol. 
215:403-410, 1990, respectively. Software for performing BLAST analyses is publicly available 
through the National Center for Biotechnology Information (http://wvwv.ncbi.nlm.nih.go - V)9 . This 
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words 
of length W in the query sequence, which either match or satisfy some positive-valued threshold 
score T when aligned with a word of the same length in a database sequence. T is referred to as the 
neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits 
act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended 
in both directions along each sequence for as far as the cumulative alignment score can be 
increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M 
(reward score for a pair of matching residues; always >0). For amino acid sequences, a scoring 
matrix is used to calculate the cumulative score. Extension of the word hits in each direction are 
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halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved 
value; the cumulative score goes to zero or below, due to the accumulation of one or more negative- 
scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm 
parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN 
program (for nucleotide sequences) uses as defaults a wordlength (W) of 1 1, an expectation (E) of 
10, M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program 
uses as defaults a wordlength of 3, and expectations (E) of 10, and the BLOSUM62 scoring matrix 
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1989) alignments (B) of 50, 
expectation (E) of 1 0, M=5, N= -4, and a comparison of both strands. 

Please replace the paragraph 274 on page 73, with the following amended paragraph: 
[0274] The BLAST programs identify homologous sequences by identifying similar 
segments, which are referred to herein as "high-scoring segment pairs," between a query amino or 
nucleic acid sequence and a test sequence which is preferably obtained from a protein or nucleic 
acid sequence database. High-scoring segment pairs are preferably identified (i.e., aligned) by 
means of a scoring matrix, many of which are known in the art. Preferably, the scoring matrix used 
is the BLOSUM62 matrix (Gonnet et ah, Science 256:1443-1445, 1 992; Henikoff and Henikoff, 
Proteins 17:49-61, 1993). Less preferably, the PAM or PAM250 matrices may also be used (see, 
e.g., Schwartz and Dayhoff, eds., 1978, Matrices for Detecting Distance Relationships: Atlas of 
Protein Sequence and Structure, Washington: National Biomedical Research Foundation). BLAST 
programs are accessible through the U.S. National Library of Medicines-erg^ 
www.ncbi.nlm.nih.gov . 

Please replace the paragraph 291 on page 78, with the following amended paragraph: 
[0291] Figure 5 is a flow diagram illustrating one embodiment of an identifier process 300 
for detecting the presence of a feature in a sequence. The process 300 begins at a start state 302 and 
then moves to a state 304 wherein a first sequence that is to be checked for features is stored to a 
memory 1 1 5 in the computer system 100. The process 300 then moves to a state 306 wherein a 
database of sequence features is opened. Such a database would include a list of each feature's 
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attributes along with the name of the feature. For example, a feature name could be "Initiation 
Codon" and the attribute would be "ATG". Another example would be the feature name 
"TAATAA Box" and the feature attribute would be "TAATAA". An example of such a database is 
produced by the University of Wisconsin Genetics Computer Group (www.gcg.com) . 
Alternatively, the features may be structural polypeptide motifs such as alpha helices, beta sheets, or 
functional polypeptide motifs such as enzymatic active sites, helix-turn-helix motifs or other motifs 
known to those skilled in the art. 
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